Finite Mixture Model




Reviews: Multi-way Interacting Regression via Factorization Machines

Neural Information Processing Systems

The paper presents a Bayesian method for regression problems in which the response variable can depend on multi-way combinations of the predictors. Via a hypergraph representation of the covariate interactions, the model is obtained as a refinement of the Finite Feature Model. The idea of modelling the interactions as the hyperedges of a hypergraph is interesting, but the proposed model seems to be technically equivalent to the original Finite Feature Model. Moreover, from the experimental results it is hard to assess whether the new parametrised extension is an improvement with respect to the baseline. Here are a few questions and comments: - Is the idea of modelling interactions via an unbounded membership model new?


Combining Structural and Unstructured Data: A Topic-based Finite Mixture Model for Insurance Claim Prediction

Hou, Yanxi, Xia, Xiaolan, Gao, Guangyuan

arXiv.org Artificial Intelligence

Modeling insurance claim amounts and classifying claims into different risk levels are critical yet challenging tasks. Traditional predictive models for insurance claims often overlook the valuable information embedded in claim descriptions. This paper introduces a novel approach by developing a joint mixture model that integrates both claim descriptions and claim amounts. Our method establishes a probabilistic link between textual descriptions and loss amounts, enhancing the accuracy of claims clustering and prediction. In our proposed model, the latent topic/component indicator serves as a proxy for both the thematic content of the claim description and the component of loss distributions. Specifically, conditioned on the topic/component indicator, the claim description follows a multinomial distribution, while the claim amount follows a component loss distribution. We propose two methods for model calibration: an EM algorithm for maximum a posteriori estimates, and an MH-within-Gibbs sampler algorithm for the posterior distribution. The empirical study demonstrates that the proposed methods work effectively, providing interpretable claims clustering and prediction.
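The role of the shared topic/component indicator can be illustrated with a minimal EM sketch in plain Python. Everything here is a toy stand-in, not the authors' implementation: a hypothetical 4-word vocabulary, two components, a multinomial for the description, and a normal distribution on log-amounts standing in for a general component loss distribution.

```python
import math
import random

random.seed(0)

VOCAB = 4  # hypothetical 4-word vocabulary (illustration only)
K = 2      # two latent topics/components

def log_multinomial(theta, counts):
    # log p(counts | theta), dropping the multinomial coefficient (constant in theta)
    return sum(c * math.log(t) for c, t in zip(counts, theta))

def log_normal_pdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def em(claims, iters=50):
    """claims: list of (word_counts, log_amount) pairs."""
    pi = [1.0 / K] * K
    theta = [[random.random() + 0.5 for _ in range(VOCAB)] for _ in range(K)]
    theta = [[t / sum(row) for t in row] for row in theta]
    mu, sigma = [0.0, 1.0], [1.0] * K
    resp = []
    for _ in range(iters):
        # E-step: posterior over the shared topic/component indicator,
        # combining the text likelihood and the loss likelihood
        resp = []
        for counts, y in claims:
            logs = [math.log(pi[k]) + log_multinomial(theta[k], counts)
                    + log_normal_pdf(y, mu[k], sigma[k]) for k in range(K)]
            m = max(logs)
            w = [math.exp(l - m) for l in logs]
            s = sum(w)
            resp.append([wi / s for wi in w])
        # M-step: weighted updates for mixing weights, topics, and loss parameters
        for k in range(K):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(claims)
            theta[k] = [sum(r[k] * c[v] for r, (c, _) in zip(resp, claims)) + 1e-6
                        for v in range(VOCAB)]
            z = sum(theta[k])
            theta[k] = [t / z for t in theta[k]]
            mu[k] = sum(r[k] * y for r, (_, y) in zip(resp, claims)) / nk
            var = sum(r[k] * (y - mu[k]) ** 2 for r, (_, y) in zip(resp, claims)) / nk
            sigma[k] = math.sqrt(var + 1e-6)
    return pi, theta, mu, sigma, resp

# Toy claims: low-loss claims use words 0-1, high-loss claims use words 2-3
claims = ([([5, 1, 0, 0], -1.0 + 0.1 * (i % 3)) for i in range(20)]
          + [([0, 0, 1, 5], 2.0 + 0.1 * (i % 3)) for i in range(20)])
pi, theta, mu, sigma, resp = em(claims)
```

On this toy data the two claim groups end up in different components, with the topic distributions and the loss parameters recovered jointly from the responsibilities.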


Byzantine-tolerant distributed learning of finite mixture models

Zhang, Qiong, Chen, Jiahua

arXiv.org Machine Learning

This paper proposes two split-and-conquer (SC) learning estimators for finite mixture models that are tolerant to Byzantine failures. In SC learning, individual machines obtain local estimates, which are then transmitted to a central server for aggregation. During this communication, the server may receive malicious or incorrect information from some local machines, a scenario known as Byzantine failures. While SC learning approaches have been devised to mitigate Byzantine failures in statistical models with Euclidean parameters, developing Byzantine-tolerant methods for finite mixture models with non-Euclidean parameters requires a distinct strategy. Our proposed distance-based methods are hyperparameter tuning free, unlike existing methods, and are resilient to Byzantine failures while achieving high statistical efficiency. We validate the effectiveness of our methods both theoretically and empirically via experiments on simulated and real data from machine learning applications for digit recognition. The code for the experiment can be found at https://github.com/SarahQiong/RobustSCGMM.
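The distance-based aggregation idea can be sketched as follows. This is a generic robust-selection heuristic, not necessarily the authors' estimator: their method works with distances between mixing measures, whereas plain Euclidean distance on flattened parameter vectors is used here purely for illustration.

```python
import math
import statistics

def byzantine_tolerant_pick(local_estimates):
    """Pick the local estimate with the smallest median distance to the others.

    local_estimates: list of flattened parameter vectors from the local machines.
    Honest machines produce estimates that cluster together, so a Byzantine
    outlier gets a large median distance and is never selected, as long as
    honest machines form a majority. No hyperparameter tuning is needed.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    scores = [statistics.median(dist(e, o)
                                for j, o in enumerate(local_estimates) if j != i)
              for i, e in enumerate(local_estimates)]
    return min(range(len(local_estimates)), key=scores.__getitem__)

# Three honest machines plus one Byzantine machine sending garbage
estimates = [[0.50, 1.00, 2.00],
             [0.52, 0.98, 2.05],
             [0.48, 1.02, 1.95],
             [100.0, -50.0, 7.0]]
```

With this input, the Byzantine vector is the only one whose median distance to the rest is large, so an honest estimate is returned.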


VICatMix: variational Bayesian clustering and variable selection for discrete biomedical data

Kirk, Paul D. W., Rao, Jackie

arXiv.org Machine Learning

Effective clustering of biomedical data is crucial in precision medicine, enabling accurate stratification of patients or samples. However, the growth in availability of high-dimensional categorical data, including 'omics data, necessitates computationally efficient clustering algorithms. We present VICatMix, a variational Bayesian finite mixture model designed for the clustering of categorical data. The use of variational inference (VI) in its training allows the model to outperform competitors in terms of efficiency, while maintaining high accuracy. VICatMix furthermore performs variable selection, enhancing its performance on high-dimensional, noisy data. The proposed model incorporates summarisation and model averaging to mitigate poor local optima in VI, allowing for improved estimation of the true number of clusters simultaneously with feature saliency. We demonstrate the performance of VICatMix on both simulated and real-world data, including applications to datasets from The Cancer Genome Atlas (TCGA), showing its use in cancer subtyping and driver gene discovery. We also demonstrate VICatMix's utility in integrative cluster analysis with different 'omics datasets, enabling the discovery of novel subtypes. Availability: VICatMix is freely available as an R package, incorporating C++ for faster computation, at https://github.com/j-ackierao/VICatMix.


Estimating the Number of Components in Finite Mixture Models via Variational Approximation

Wang, Chenyang, Yang, Yun

arXiv.org Machine Learning

This work introduces a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes, inspired by the large-sample properties of the Evidence Lower Bound (ELBO) derived from mean-field (MF) variational approximation. Specifically, we establish matching upper and lower bounds for the ELBO without assuming conjugate priors, suggesting the consistency of model selection for FMMs based on maximizing the ELBO. As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior (benefiting from model singularity) of the posterior distribution, which tends to eliminate the extra components under model misspecification where the number of mixture components is over-specified. This stable behavior also leads to the $n^{-1/2}$ convergence rate for parameter estimation, up to a logarithmic factor, under such model over-specification. Empirical experiments are conducted to validate our theoretical findings and to compare our method with other state-of-the-art methods for selecting the number of components in FMMs.


Clustering Change Sign Detection by Fusing Mixture Complexity

Urano, Kento, Yuki, Ryo, Yamanishi, Kenji

arXiv.org Machine Learning

This paper proposes an early detection method for cluster structural changes. Cluster structure refers to discrete structural characteristics, such as the number of clusters, when data are represented using finite mixture models such as Gaussian mixture models. We focus on scenarios in which the cluster structure changes gradually over time. For finite mixture models, the concept of mixture complexity (MC) measures the continuous cluster size by considering the cluster proportion bias and the overlap between clusters. In this paper, we propose MC fusion as an extension of MC to handle situations in which multiple mixture numbers are possible in a finite mixture model. By incorporating the fusion of multiple models, our approach accurately captures the cluster structure during transitional periods of gradual change. Moreover, we introduce a method for detecting changes in the cluster structure by examining the transition of MC fusion. We demonstrate the effectiveness of our method through empirical analysis using both artificial and real-world datasets.
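One way to read MC (our paraphrase; the exact definition and normalization are in the paper) is as an empirical mutual information between an observation and its latent cluster label: the entropy of the mixing weights minus the average posterior entropy. A minimal sketch for a fixed two-component 1D Gaussian mixture:

```python
import math

def entropy(p):
    # Shannon entropy in nats of a discrete distribution p
    return -sum(q * math.log(q) for q in p if q > 0)

def posterior(x, pi, mu, sigma):
    # Posterior over the component label given an observation x
    def npdf(x, m, s):
        return math.exp(-((x - m) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
    w = [p * npdf(x, m, s) for p, m, s in zip(pi, mu, sigma)]
    z = sum(w)
    return [wi / z for wi in w]

def mixture_complexity(data, pi, mu, sigma):
    # H(Z) minus the average posterior entropy: well-separated clusters give
    # MC close to H(pi) (= log K for uniform weights), while heavily
    # overlapping clusters push MC toward 0, i.e., effectively fewer clusters.
    avg_cond = sum(entropy(posterior(x, pi, mu, sigma)) for x in data) / len(data)
    return entropy(pi) - avg_cond
```

For two well-separated components the value approaches log 2; as the components merge, it continuously decays toward 0, which is the "continuous cluster size" behavior the abstract describes.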


Dendrogram of mixing measures: Hierarchical clustering and model selection for finite mixture models

Do, Dat, Do, Linh, McKinley, Scott A., Terhorst, Jonathan, Nguyen, XuanLong

arXiv.org Machine Learning

In modern data analysis, it is often useful to reduce the complexity of a large dataset by clustering the observations into a small and interpretable collection of subpopulations. Broadly speaking, there are two major approaches. In "model-based" clustering, the data are assumed to be generated by a (usually small) collection of simple probability distributions such as normal distributions, and clusters are inferred by fitting a probabilistic mixture model. Because of their transparent probabilistic assumptions, the statistical properties of mixture models are well-understood. In particular, if there is no model misspecification, i.e., the data truly come from a mixture distribution, then the subpopulations can be consistently estimated. Unfortunately, this appealing asymptotic guarantee is somewhat at odds with what is often observed in practice, whereby mixture models fitted to complex datasets often return an uninterpretably large number of components, many of which are quite similar to each other. The tendency of mixture models to overfit on real data leads many analysts to employ "model-free" clustering methods instead. A well-known example is hierarchical clustering, which organizes the data into a nested sequence of partitions at different resolutions. It is particularly useful for data exploration as it does not require fixing a number of subpopulations a priori and can be visualized using a dendrogram.
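The hierarchical-clustering baseline mentioned above can be sketched with a minimal single-linkage agglomeration on 1D points; real implementations work on distance matrices and return the full nested sequence of merges (the dendrogram) rather than a flat cut at k clusters:

```python
def single_linkage(points, k):
    # Start with every point in its own cluster; repeatedly merge the two
    # clusters whose closest members are nearest, until k clusters remain.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Recording the distance at each merge, instead of stopping at a fixed k, yields the full dendrogram, which is what makes the method useful for exploration without fixing the number of subpopulations a priori.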


Parameter Identification for Partial Differential Equations with Spatiotemporal Varying Coefficients

Zhang, Guangtao, Duan, Yiting, Pan, Guanyu, Chen, Qijing, Yang, Huiyu, Zhang, Zhikun

arXiv.org Artificial Intelligence

To comprehend complex systems with multiple states, it is imperative to identify these states from system outputs. However, the mathematical models describing such systems often exhibit nonlinearity, which renders the parameter inverse problem from observed spatiotemporal data a challenging endeavor. Starting from the data observed from such systems, we propose a novel framework that facilitates the investigation of parameter identification for multi-state systems governed by spatiotemporally varying parametric partial differential equations. Our framework consists of two integral components: a constrained self-adaptive physics-informed neural network, encompassing a sub-network, as our methodology for parameter identification, and a finite mixture model approach to detect regions of probable parameter variations. Through our scheme, we can precisely ascertain the unknown varying parameters of the complex multi-state system, thereby accomplishing the inversion of the varying parameters. Furthermore, we showcase the efficacy of our framework on two numerical cases: the 1D Burgers' equation with time-varying parameters and the 2D wave equation with a space-varying parameter.
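The finite-mixture step of such a framework, detecting regions of probable parameter variation, can be sketched by labeling pointwise parameter estimates with their MAP component under a two-component Gaussian mixture. The parameters below are fixed toy values; in practice the mixture would be fitted (e.g., by EM) to the estimates recovered by the neural network.

```python
import math

def detect_regions(param_estimates, mu, sigma, pi):
    # MAP component label per location under a fixed Gaussian mixture: a run of
    # identical labels marks a region where the PDE parameter is plausibly
    # constant, and a label switch marks a probable parameter variation.
    def npdf(x, m, s):
        return math.exp(-((x - m) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
    labels = []
    for x in param_estimates:
        w = [p * npdf(x, m, s) for p, m, s in zip(pi, mu, sigma)]
        labels.append(w.index(max(w)))
    return labels

# Toy pointwise estimates along a spatial grid: the parameter jumps
# from about 0.1 to about 1.0 halfway through the domain
estimates = [0.10, 0.12, 0.09, 1.00, 0.95, 1.05]
labels = detect_regions(estimates, [0.1, 1.0], [0.1, 0.1], [0.5, 0.5])
```

The label sequence switches exactly where the underlying parameter changes, localizing the variation region.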